Using Cohesion and Coherence Models for Text Summarization

Authors

  • Inderjeet Mani
  • Eric Bloedorn
Abstract

In this paper we investigate two classes of techniques to determine what is salient in a text, as a means of deciding whether that information should be included in a summary. We introduce three methods based on text cohesion, which models text in terms of relations between words or referring expressions, to help determine how tightly connected the text is. We also describe a method based on text coherence, which models text in terms of macro-level relations between clauses or sentences, to help determine the overall argumentative structure of the text. The paper compares the salience scores produced by the cohesion and coherence methods with one another and with human judgments. The results show that while the coherence method outperforms the cohesion methods in accuracy of determining clause salience, the best cohesion method can reach 76% of the accuracy level of the coherence method. Further, two of the cohesion methods each yield significant positive correlations with the human salience judgments. We also compare the types of discourse-related text structure discovered by the cohesion and coherence methods.

Similar resources

An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches

Text coherence evaluation has become a vital task in Natural Language Processing subfields such as text summarization, question answering, text generation and machine translation. Existing methods, such as entity-based and graph-based models, track how nouns and noun phrases change role across consecutive sentences within a short part of a text. They even have limitations in global coheren...

Cohesion and coherence for Automatic Summarization

This paper presents the integration of cohesive properties of text with coherence relations, to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only slight improvement in the resulting s...

Integrating cohesion and coherence for Automatic Summarization

This paper presents the integration of cohesive properties of text with coherence relations, to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only slight improvement in the resulting s...

EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and WordNet, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...

Generating Indicative-Informative Summaries with SumUM

Abstracts are texts used in tasks such as assessing the content of the document and deciding if the source is worth reading. If text summarization systems are designed to fulfil those requirements, the quality of the generated texts has to be evaluated according to their intended function. The quality of human-produced abstracts has been examined in the literature (Grant, 1992; Kaplan et al., 1994; Gib...

Publication date: 1998